Fourth Lab

Today’s goals. Learn about:

  1. Hypothesis Testing
  2. Likelihood theory (some initial results)

Workflow

  1. Consider examples related to theoretical concepts
  2. Use R to solve some basic questions and to understand important aspects

Let’s start

  1. Download the R code (it is briefly commented)
  2. You will find XXX markers where I ask you to code a bit
  3. The code after each XXX depends on successfully filling in these lines of code

Hypothesis testing

Basic concepts of hypothesis testing

Basic concepts

The null hypothesis for the parameter \(\theta\) is usually expressed as \[\text{H}_0: \theta = \theta_0\] Complementary to the choice of \(\text{H}_0\), we have to specify the alternative hypothesis \(\text{H}_1\), which specifies the values of the parameter that become reasonable when \(\text{H}_0\) does not hold. Usually \(\text{H}_1\) is one of:

  • \(\text{H}_1: \theta \neq \theta_0\) (two-sided alternative)
  • \(\text{H}_1: \theta > \theta_0\) or \(\text{H}_1: \theta < \theta_0\) (one-sided alternative)
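To make the distinction concrete, here is a minimal sketch on simulated data (the sample, the seed and the hypothesised value \(\theta_0 = 0\) are all illustrative choices, not part of the lab data) showing how the `alternative` argument of `t.test()` encodes \(\text{H}_1\):

```r
# Illustrative one-sample t-tests: same data, different alternatives
set.seed(1)                                  # arbitrary seed, for reproducibility
x <- rnorm(30, mean = 0.4)                   # simulated sample, true mean 0.4
t.test(x, mu = 0)                            # H1: theta != theta_0 (two-sided)
t.test(x, mu = 0, alternative = "greater")   # H1: theta > theta_0 (one-sided)
t.test(x, mu = 0, alternative = "less")      # H1: theta < theta_0 (one-sided)
```

Note that, when the observed statistic is positive, the two-sided p-value is twice the "greater" one-sided p-value.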

Test for the mean difference: Anorexia example

Anorexia example: Hypothesis test formulation

  • Let’s consider again the example on Anorexia, illustrated in the previous lecture. We have two groups and we want to ask whether the mean weight changes of the two groups can be considered equal.

  • We could set up a test with the following aim: does the cognitive behavioral therapy group have a mean weight change equal to that of the control group? Or, does the cognitive behavioral therapy group have a greater mean weight change than the control group?

  • Under the same assumptions detailed above, we aim to compare their means, \(\mu_1\) and \(\mu_2\), through the following hypothesis tests (consider \(\alpha=0.05\))

Two-sided two-sample test
\(\begin{cases} H_0: \mu_{1}-\mu_{2} = 0 \\ H_1: \mu_{1}-\mu_{2} \neq 0 \end{cases}\)

One-sided two-sample test
\(\begin{cases} H_0: \mu_{1}-\mu_{2} = 0 \quad (\text{equivalently}\quad \mu_{1}-\mu_{2} \leq 0)\\ H_1: \mu_{1}-\mu_{2} > 0 \end{cases}\)
  • Assuming that \(\sigma^2_1=\sigma^2_2 = \sigma^2\), the test statistic under \(\text{H}_0\) has the form \[T = \frac{\overline X - \overline Y}{S\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}} \underset{H_{0}}{\sim}t_{n_1+n_2-2},\] where \(S\) is the pooled sample standard deviation.

Test for the mean difference: Anorexia example

Two-sided two-sample test

# The results obtained by using the t.test() 
# function for the two-sided two-sample test are
Anor <- read.table("http://stat4ds.rwth-aachen.de/data/Anorexia.dat", 
                   header=TRUE)
# Get difference post-pre treatment for the group cb and c 
cogbehav <- Anor$after[Anor$therapy == "cb"] - Anor$before[Anor$therapy == "cb"]
control <- Anor$after[Anor$therapy == "c"] - Anor$before[Anor$therapy == "c"]
res.two <- t.test(cogbehav, control, var.equal = TRUE)
res.two

    Two Sample t-test

data:  cogbehav and control
t = 1.676, df = 53, p-value = 0.09963
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.680137  7.593930
sample estimates:
mean of x mean of y 
 3.006897 -0.450000 

One-sided two-sample test

# The results obtained for the one-sided two-sample test are
res.one <- t.test(cogbehav, control, 
                  var.equal = TRUE, 
                  alternative = "greater")
res.one

    Two Sample t-test

data:  cogbehav and control
t = 1.676, df = 53, p-value = 0.04981
alternative hypothesis: true difference in means is greater than 0
95 percent confidence interval:
 0.003879504         Inf
sample estimates:
mean of x mean of y 
 3.006897 -0.450000 

Test for the mean difference: Anorexia example (by hand)

Two-sided two-sample test

n1 <- length(cogbehav)
n2 <- length(control)

s2 <- ((n1 - 1) * var(cogbehav) + (n2 - 1) * var(control))/(n1 + n2 - 2)
testStat <- (mean(cogbehav) - mean(control))/sqrt(s2 * (1/n1 +1/n2))
## two-sided 
round(2 * pt(testStat, df = n1+n2-2, lower = FALSE ), 5)
[1] 0.09963

One-sided two-sample test

## one-sided
round(pt(testStat, df = n1+n2-2, lower = FALSE ), 5)
[1] 0.04981

Conclusions

What are the conclusions? Take a look at the ASA statement on p-values: https://doi.org/10.1080/00031305.2016.1154108

Test for the mean difference: Anorexia example

library(RColorBrewer)
plotclr <- brewer.pal(6, "YlOrRd")

curve(dt(x, n1 + n2 - 2), xlim = c(-5, 5), ylim = c(0, 0.4), 
      main = "p-values and rejection region", col = "blue", 
      lwd = 2, xlab = "x-y", ylab = expression(t[n[1]+n[2]-2]), yaxs = "i")
cord.x <- c(qt(0.95, n1 + n2 - 2), seq(qt(0.95, n1 + n2 - 2), 5, 0.01), 5)
cord.y <- c(0, dt(seq(qt(0.95, n1 + n2 - 2), 5, 0.01), n1 + n2 - 2), 0)
polygon(cord.x, cord.y, col = plotclr[3], border = NA)

abline(v = res.one$statistic, lty = 2, lwd = 2, col = "red")
text(0, 0.2, expression("Accept" ~ H[0]))
text(2.7, 0.08, expression("Reject" ~ H[0]))
text(as.double(res.one$statistic) - 0.15, 0.02, "t", col = "red", cex = 1.2)

Test of the equality of the variances

F-test for equality of variances

  • Let us suppose we have two normally distributed populations. The test of the equality of the variances is useful, for example, when we want to verify the assumption of equal variances before carrying out the test for the mean difference when the variances are unknown.

  • For two independent samples with sample sizes \(n_1\) and \(n_2\), respectively, from \(X\sim \mathcal{N}(\mu_1, \sigma^2_1)\) and \(Y \sim \mathcal{N}(\mu_2, \sigma^2_2)\), we want to test: \[\begin{cases} {\rm H_0}:\sigma^2_1=\sigma^2_2 \\ {\rm H_1}:\sigma^2_1 \neq \sigma^2_2 \end{cases}\] Under \({\rm H_0}\) (equal variances) the test statistic \[T=\frac{S^2_1}{S^2_2}\sim F_{n_1-1, n_2-1}\] has an \(F\)-distribution with numerator degrees of freedom \(n_1-1\) and denominator degrees of freedom \(n_2-1\) (denoted with \(F_{n_1-1, n_2-1}\)). \(S^2_1\) and \(S^2_2\) are the unbiased sample variance estimators.

  • Then, by computing \(s^2_1\) and \(s^2_2\) and so \(t_0 = s^2_1/s^2_2\), we reject \({\rm H_0}\) if \(t_0<f_{n_1-1,n_2-1,\frac{\alpha}{2}}\) or if \(t_0>f_{n_1-1,n_2-1,1-\frac{\alpha}{2}}\).

\[ \text{p-value} = 2 \min(P(T < t_0), P(T > t_0)) \]
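The two rejection bounds \(f_{n_1-1,n_2-1,\frac{\alpha}{2}}\) and \(f_{n_1-1,n_2-1,1-\frac{\alpha}{2}}\) can be computed with `qf()`; a minimal sketch (the sample sizes match the anorexia example, and \(\alpha = 0.05\) is an illustrative choice):

```r
# F-test rejection bounds: reject H0 if t0 falls outside (lower, upper)
alpha <- 0.05
n1 <- 29; n2 <- 26                    # sample sizes as in the anorexia example
qf(alpha / 2, n1 - 1, n2 - 1)         # lower critical value f_{28,25,alpha/2}
qf(1 - alpha / 2, n1 - 1, n2 - 1)     # upper critical value f_{28,25,1-alpha/2}
```

Under \({\rm H_0}\) the ratio of variances is centred at 1, so the lower bound is below 1 and the upper bound above 1.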

Test of the equality of the variances: Anorexia example

# Test for equality of variance using the var.test function
var.test(cogbehav, control, alternative = "two.sided") 

    F test to compare two variances

data:  cogbehav and control
F = 0.83696, num df = 28, denom df = 25, p-value = 0.6449
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
 0.3805749 1.8090469
sample estimates:
ratio of variances 
         0.8369592 
# Test for equality of variance by hand
ratiovar <- var(cogbehav)/var(control) # Test statistic 
pv_bi <- 2 * min(pf(ratiovar, n1 - 1, n2 - 1, lower = FALSE), 
                 1 - pf(ratiovar, n1 - 1, n2 - 1, lower = FALSE))
pv_bi
[1] 0.6448602

Pearson’s chi-squared test

Pearson’s chi-squared test

Applications of Pearson’s chi-squared test

This is a class of tests applied to sets of categorical data to evaluate whether any observed difference between the sets arose by chance. It is suitable for unpaired data from large samples. Pearson’s chi-squared test is used to assess three types of comparison:

  • Goodness of fit: Establishes whether an observed frequency differs from a theoretical distribution.

  • Homogeneity: Tests whether two or more sub-groups of a population share the same distribution of a single categorical variable.

  • Independence: Determines whether two categorical variables are associated.

In all cases, under \({\rm H}_0\), the test statistic is (approximately) distributed according to a chi-squared distribution with appropriate degrees of freedom.

Pearson’s chi-squared test: Test for independence

Test for independence

Question: is there a relationship (association) between two categorical variables?

  • We want to carry out the hypothesis test \[\begin{cases}{\rm H_0}: \text{there is no relationship between the categorical variables} \\ {\rm H_1}: \text{there is a relationship between the categorical variables} \end{cases} \] \({\rm H_1}\) is not one-sided or two-sided: sometimes it is referred to as ‘many-sided’ since it allows any kind of difference. However, \({\rm H_1}\) says that there is a relationship but does not specify any particular kind of relationship.

  • To test \({\rm H_0}\), we compare the observed counts with the expected counts, that is the counts that we would expect if \({\rm H_0}\) were true. If the observed counts are far from the expected counts, there is evidence against \({\rm H_0}\).

Pearson’s chi-squared test: Test for independence

Test for independence

Then, the data are organised in a two-way table of (observed) counts, or contingency table. Thus, let \(X\) and \(Y\) be the two categorical variables, with \(t\) and \(s\) categories, respectively.

Y/X \(x_1\) \(\cdots\) \(x_j\) \(\cdots\) \(x_t\) Total
\(y_1\) \(n_{11}\) \(\cdots\) \(n_{1j}\) \(\cdots\) \(n_{1t}\) \(n_{1\boldsymbol{\cdot}}\)
\(\vdots\) \(\vdots\) \(\ddots\) \(\vdots\) \(\ddots\) \(\vdots\) \(\vdots\)
\(y_i\) \(n_{i1}\) \(\cdots\) \(n_{ij}\) \(\cdots\) \(n_{it}\) \(n_{i\boldsymbol{\cdot}}\)
\(\vdots\) \(\vdots\) \(\ddots\) \(\vdots\) \(\ddots\) \(\vdots\) \(\vdots\)
\(y_s\) \(n_{s1}\) \(\cdots\) \(n_{sj}\) \(\cdots\) \(n_{st}\) \(n_{s\boldsymbol{\cdot}}\)
Total \(n_{\boldsymbol{\cdot} 1}\) \(\cdots\) \(n_{\boldsymbol{\cdot}j}\) \(\cdots\) \(n_{ \boldsymbol{\cdot}t}\) \(n_{\boldsymbol{\cdot}\boldsymbol{\cdot}}\)

where

  • \(n_{ij}\) = number of pairs \((y_i, x_j)\), \(i=1, \ldots, s\), \(j=1, \ldots, t\) (absolute frequencies)
  • \(n_{i\boldsymbol{\cdot}}= \sum_{j=1}^{t}n_{ij}, i=1,\ldots, s\) (marginal frequencies of \(y_i\))
  • \(n_{\boldsymbol{\cdot}j}= \sum_{i=1}^{s}n_{ij}, j=1,\ldots, t\) (marginal frequencies of \(x_j\))
  • \(n_{\boldsymbol{\cdot}\boldsymbol{\cdot}}=n=\sum_{i=1}^{s}\sum_{j=1}^{t}n_{ij}\) (total sample size)

Pearson’s chi-squared test: Test for independence

Test for independence

The test statistic that allows the comparison between observed and expected counts is Pearson’s chi-squared statistic, which measures how far the observed counts in the two-way table are from the expected counts. It is always nonnegative, and it is zero only when the observed counts are exactly equal to the expected counts.

Then

  • Large values of the statistic are evidence against \({\rm H_0}\)

  • Small values of the statistic do not provide evidence against \({\rm H_0}\)

Note that even though \({\rm H_1}\) is many-sided, the Pearson’s chi-squared test is one-sided, because any violation of \({\rm H_0}\) tends to produce a large value of the statistic.

Pearson’s chi-squared test: Test for independence

Test for independence

Then, under \(H_0\), leveraging the independence relation

\[P(X=x_j, Y=y_i)=P(X=x_j)\times P(Y=y_i),\]

for any \(i=1 , \ldots, s\) and \(j=1 , \ldots, t\), we can obtain the table of expected frequencies under \({\rm H_0}\), denoted as \(n^*_{ij}\)

Y/X \(x_1\) \(\cdots\) \(x_j\) \(\cdots\) \(x_t\)
\(y_1\) \(n^*_{11}\) \(\cdots\) \(n^*_{1j}\) \(\cdots\) \(n^*_{1t}\)
\(\vdots\) \(\vdots\) \(\ddots\) \(\vdots\) \(\ddots\) \(\vdots\)
\(y_i\) \(n^*_{i1}\) \(\cdots\) \(n^*_{ij}\) \(\cdots\) \(n^*_{it}\)
\(\vdots\) \(\vdots\) \(\ddots\) \(\vdots\) \(\ddots\) \(\vdots\)
\(y_s\) \(n^*_{s1}\) \(\cdots\) \(n^*_{sj}\) \(\cdots\) \(n^*_{st}\)

where \[n^*_{ij}=\frac{n_{i \boldsymbol{\cdot}} n_{\boldsymbol{\cdot}j}}{n}\]

Pearson’s chi-squared test: Test for independence

Test for independence

Then, the Pearson \(X^2\) statistic is

\[X^2=\sum_{i=1}^{s}\sum_{j=1}^{t}\frac{(N_{ij}-n^*_{ij})^2}{n^*_{ij}}\underset{H_0}{\stackrel{\cdot}\sim} \chi^2_{(s-1)(t-1)}\]

Then, using the observed frequencies, \(n_{ij}\), and the expected frequencies, \(n^*_{ij}\), we can obtain the observed test statistic

\[t_0=\sum_{i=1}^{s}\sum_{j=1}^{t}\frac{(n_{ij}-n^*_{ij})^2}{n^*_{ij}}\]

By fixing the significance level \(\alpha\), \(\text{we reject} \, \, {\rm H_0}\,\, \text{if}\,\, t_0>\chi^2_{(s-1)(t-1); 1-\alpha}\) or equivalently if \[t_0 \in \mathcal{R}_\alpha=(\chi^2_{(s-1)(t-1); 1-\alpha},+\infty) \]
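The critical value \(\chi^2_{(s-1)(t-1); 1-\alpha}\) is obtained with `qchisq()`; a minimal sketch (the table dimensions match the 3×4 blood-type example analysed next, and \(\alpha = 0.05\) is an illustrative choice):

```r
# Chi-squared critical value for a contingency table with s rows and t columns
alpha <- 0.05
nr <- 3; nc <- 4                              # s = 3 rows, t = 4 columns
qchisq(1 - alpha, df = (nr - 1) * (nc - 1))   # reject H0 if t0 exceeds this value
```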

Pearson’s chi-squared test: Test for independence

Example

Let us consider the following example to assess whether two qualitative characteristics are independent. Blood type can be classified into four groups: A, B, AB and O. To assess whether the distribution of blood type is independent of membership in the Italian macro-regions (“South”, “Center”, “North”), we consider a sample of 760 subjects. The following table shows the joint distribution of blood type (\(X\)) and macro-region (\(Y\)).

A B AB O
South 50 70 30 100
Center 114 30 10 100
North 116 27 13 100

Basically, you observed the table of absolute frequencies.

Then you can derive the relative frequencies \(f_{jk} = n_{jk}/n\), \(f_{j\cdot}=n_{j\cdot}/n\) and \(f_{\cdot k}=n_{\cdot k}/n\) and perform the following hypothesis test \[\begin{cases}H_0: f_{jk}=f_{j\cdot}f_{\cdot k} \,\,\,\forall j=1, \ldots, J, \,\, k=1, \ldots, K \\ H_1: \exists (j,k): f_{jk}\neq f_{j\cdot}f_{\cdot k} \end{cases}\]

Pearson’s chi-squared test: Test for independence example

Using chisq.test() function

n <- 760
obs_freq <- matrix(c(50, 70, 30, 100,
                     114, 30, 10, 100, 
                     116, 27, 13, 100),
                   nrow = 3, ncol = 4, byrow = TRUE)
colnames(obs_freq) <- c("A", "B", "AB", "O")
rownames(obs_freq) <- c("South", "Central", "North")
chisq.test(obs_freq)

    Pearson's Chi-squared test

data:  obs_freq
X-squared = 70.99, df = 6, p-value = 2.561e-13

Manual calculation

# The results obtained by means of the 
# chisq.test() function can be reproduced as 
mx <- colSums(obs_freq)/n
my <- rowSums(obs_freq)/n
exp_freq <- outer(my, mx) * n
exp_freq
               A        B       AB         O
South   92.10526 41.77632 17.43421  98.68421
Central 93.57895 42.44474 17.71316 100.26316
North   94.31579 42.77895 17.85263 101.05263
chi2 <- sum((obs_freq - exp_freq)^2/exp_freq)
chi2
[1] 70.99012
pchisq(chi2, 
       (ncol(obs_freq) - 1) * (nrow(obs_freq) - 1), 
       lower.tail = FALSE)
[1] 2.561274e-13

Pearson’s chi-squared test: Test for independence example

Question

What if we remove the “South” macro-region? Are the conclusions of the test different?

chisq.test(obs_freq[-1,])

    Pearson's Chi-squared test

data:  obs_freq[-1, ]
X-squared = 0.55876, df = 3, p-value = 0.9058

Pearson’s chi-squared test: Goodness of fit test

Example: Testing for a Poisson distribution

The number of children per family in a survey of 200 households is

Number of children 0 1 2 3 4 \(\geq 5\)
Number of families 52 60 55 18 8 7

Goal: perform a hypothesis test to assess whether it is reasonable to assume that \(X\) (number of children in a family) is distributed according to a \(Poisson(\lambda)\), with \(\lambda = 1.5\).

Pearson’s chi-squared test: Goodness of fit test

child <- 0:5
fam <- c(52, 60, 55, 18, 8, 7)
lambda <- 1.5
plot(child, fam/200, ylim = c(0, 0.35),
     xlab = "Number of children", ylab = "Relative frequency")
segments(child, rep(0, 6), child, 
         c(dpois(0:4, lambda), ppois(4, lambda, lower = FALSE)))
c(dpois(0:4, lambda), ppois(4, lambda, lower = FALSE))

Pearson’s chi-squared test: Goodness of fit test

Example: Testing for a Poisson distribution

The null hypothesis \(H_0: X \sim \mathcal{P}(\lambda = 1.5)\) translates into

\[H_0: \pi_0= 0.2231, \quad \pi_1 = 0.3347, \quad \pi_2 = 0.2510, \quad \pi_3 = 0.1255, \quad \pi_4 = 0.0471, \quad \pi_5 = 0.0186 \] where \(\pi_j = P(X = j)\), \(j = 0, \ldots, 4\), and \(\pi_5 = P(X\geq5)\).

The test is based on the comparison between the observed frequencies and the expected frequencies under \(H_0\), and it makes use of the chi-squared test statistic. The needed quantities are reported in the table below

Number of Children 0 1 2 3 4 \(\geq 5\)
Observed frequencies (\(n_i\)) 52 60 55 18 8 7
Expected frequencies \((n \times \pi_i)\) 44.6260 66.9390 50.2043 25.1021 9.4133 3.7152
\(\frac{(n_i - n \pi_i)^2}{n \pi_i}\) 1.2185 0.7193 0.4581 2.0094 0.2122 2.9043
exp <- c(dpois(0:4, lambda), ppois(4, lambda, lower = FALSE)) * 200 
round(exp, 4)
[1] 44.6260 66.9390 50.2043 25.1021  9.4133  3.7152
chisq_el <- (fam-exp)^2/exp
round((fam-exp)^2/exp, 4)
[1] 1.2185 0.7193 0.4581 2.0094 0.2122 2.9043

Pearson’s chi-squared test: Goodness of fit test

Example: Testing for a Poisson distribution

The test is based on \[\chi^2 = \sum_{j = 0}^{5} \frac{(N_j - N \pi_j)^2}{N \pi_j},\] which for large samples satisfies \(\chi^2\stackrel{\cdot}\sim \chi^2_{K-1}\), with \(K=6\) categories. The rejection region is

\[\mathcal{R} = \{ \chi^2 > \chi^2_{K-1; 1-\alpha}\} = \{ \chi^2 > 11.0705\}\] The observed test statistic is \[\chi^2 = \sum_{j = 0}^{5} \frac{(n_j - n \pi_j)^2}{n \pi_j} = 7.521784\] This value does not belong to the rejection region, so we do not have evidence to discard \(H_0\). The p-value, denoting by \(Q\) a \(\chi^2_{K-1}\) random variable, is \[\text{p-value} = P(Q > \chi^2) = 1- P(Q \leq \chi^2) = 1-F_{Q}(7.521784) = 0.18 \]

Pearson’s chi-squared test: Goodness of fit test

chisq.obs <- sum(chisq_el)
chisq.obs 
[1] 7.521784
pchisq(chisq.obs, df = 5, lower=FALSE)
[1] 0.1846351
chisq.test(fam, p = exp/200)

    Chi-squared test for given probabilities

data:  fam
X-squared = 7.5218, df = 5, p-value = 0.1846

Likelihood theory: Maximum likelihood estimation

Likelihood inference

The likelihood function

Let \(\mathcal{F}\) be a parametric statistical model for the data \(y\), where \(f(y;\theta)\) is the density or probability function and \(\theta \in \Theta\) is a \(p\)-dimensional parameter. By considering \(f(y;\theta)\) as a function of \(\theta\), with \(y\) fixed at the observed value, the likelihood function of \(\theta\) based on \(y\), \(L: \Theta \rightarrow \mathbb{R}^{+}\), is \[L(\theta) = L(\theta; y) \propto f_Y(y; \theta) \,\,, \] where the proportionality is up to a constant \(c(y)>0\) that does not depend on \(\theta\).

On the basis of the data \(y\), \(\theta \in \Theta\) is more likely than \(\theta' \in \Theta\) as an index of the data-generating model if \(L(\theta) > L(\theta')\).

Note

  • \(L(\theta)\) is not a density function (on \(\Theta\)).
  • It is convenient to work with \(\ell(\theta) = \log L(\theta)\).
  • If we assume independent observations for \(y=(y_1, \ldots, y_n)\), then \(L(\theta) \propto \prod_{i=1}^{n} f_{Y_i}(y_i; \theta)\) and, up to an additive constant, \(\ell(\theta) = \sum_{i=1}^{n} \log f_{Y_i}(y_i; \theta)\).
  • The value \(\hat \theta\) that maximizes \(L(\theta)\) (or equivalently \(\ell(\theta)\)) is the maximum likelihood estimate.

Binomial model

Maximum likelihood estimation in the Binomial model

Let \(y=(y_{1},\ldots, y_{n})\) be a sample of i.i.d. values from a Bernoulli distribution, \(Y \sim \mbox{Be}(p)\). Then the likelihood function is

\[L(p) = \prod_{i=1}^{n} p^{y_i} (1-p)^{1-y_i} \] and the log-likelihood function takes the form \[\ell(p) = \sum_{i=1}^{n} \left[ y_i \log (p) + (1-y_i) \log (1-p) \right]\] The maximum likelihood estimate, \(\hat p\), is the sample proportion \(\hat p =\frac{1}{n}\sum_{i=1}^{n}y_i\). To derive it, we obtain the score function \[\ell_{\star}(p)=U(p)=\frac{\partial \ell(p)}{\partial p}= \sum_{i=1}^{n} \left( \frac{y_i}{p} - \frac{1-y_i}{1-p} \right) = \frac{\sum_{i=1}^{n}y_i}{p} - \frac{n- \sum_{i=1}^{n}y_i}{1-p}\] By setting the score function equal to zero, we obtain the maximum likelihood estimate

\[U(p) = 0 \implies \hat p = \frac{1}{n}\sum_{i=1}^{n} y_i \]
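As a quick numerical check of this derivation, a sketch on simulated data (the seed and the true \(p = 0.6\) are arbitrary choices): the score \(U(p)\) evaluated at the sample proportion is indeed zero.

```r
# Check that the score function vanishes at the MLE (sample proportion)
set.seed(42)                             # arbitrary seed
y <- rbinom(100, size = 1, prob = 0.6)   # simulated Bernoulli sample
U <- function(p, y) sum(y) / p - (length(y) - sum(y)) / (1 - p)
phat <- mean(y)
U(phat, y)                               # numerically zero at p = phat
```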

Binomial model

Example

Suppose we generate a random sample \(y\) of size \(n=100\) from a Bernoulli distribution with parameter \(p=0.6\). We want to make inference on \(p\). Thus,
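The generation step itself is not shown in the code; a hypothetical version is below (the seed is an assumption, since the original chunks do not show it, so the numeric values printed in the following chunks may differ slightly from those obtained here). It also defines `n` and `p`, which are used later for plotting.

```r
# Hypothetical data-generation step (seed chosen arbitrarily)
set.seed(123)
n <- 100
p <- 0.6
y <- rbinom(n, size = 1, prob = p)
```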

We want to write a function that takes as arguments the parameter and the data, and returns the log-likelihood. We can do it in two ways:

  • Leveraging the dbinom() function
llik_bin <- function(theta, data){
  sum(dbinom(x = data, size = 1, prob = theta, log = TRUE))
}
  • Writing the log-likelihood function manually
llik_bin2 <- function(theta, data){
  sum(data) * log(theta) + (length(data) - sum(data)) * log(1-theta)
}

Binomial model

Example

We can check that both functions return the same value, for instance by fixing \(p=0.5\)

llik_bin(0.5,y)
[1] -69.31472
llik_bin2(0.5,y)
[1] -69.31472

The maximum likelihood estimate is available in closed form (later we will see how to carry out maximum likelihood estimation using numerical optimisation, which is particularly useful when the ML estimate cannot be obtained analytically), so

MLEp  <- mean(y)
MLEp
[1] 0.63
llik_bin(MLEp, y)
[1] -65.89557
pgrid <- seq(0.01, 0.99, by = 0.01)

pgrid[which.max(llik_bin2(pgrid, y))]
[1] 0.63
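As a preview of the numerical optimisation mentioned above, here is a minimal sketch using `optimize()`, base R's one-dimensional optimiser (the simulated sample and seed are assumptions, and `llik_bin2` is restated so the chunk is self-contained):

```r
# Numerical maximisation of the Bernoulli log-likelihood with optimize()
set.seed(123)                                  # arbitrary seed
y <- rbinom(100, size = 1, prob = 0.6)         # simulated sample
llik_bin2 <- function(theta, data)
  sum(data) * log(theta) + (length(data) - sum(data)) * log(1 - theta)
opt <- optimize(llik_bin2, interval = c(0.001, 0.999),
                data = y, maximum = TRUE)
opt$maximum    # numerical MLE, essentially equal to mean(y)
mean(y)
```

The interval is kept strictly inside \((0,1)\) because the log-likelihood diverges at the boundary.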

Binomial model

Example

Note that above we used the function built manually, because it is vectorized in theta, while the first one must be vectorized before such use. Indeed

llik_bin(c(0.5, 0.75, 0.99), y) # Wrong
[1] -98.22045
llik_bin2(c(0.5, 0.75, 0.99), y) # Correct
[1]  -69.31472  -69.41686 -171.02447
# The wrong result above is due to prob being recycled over the data:
sum(dbinom(x = y[seq(1,100,3)], size = 1, prob = 0.5, log = TRUE)) + 
sum(dbinom(x = y[seq(2,100,3)], size = 1, prob = 0.75, log = TRUE)) + 
sum(dbinom(x = y[seq(3,100,3)], size = 1, prob = 0.99, log = TRUE))
[1] -98.22045
# However, we can vectorize the function llik_bin by means of 
llik_bin_v <- Vectorize(llik_bin, 'theta')
llik_bin_v(c(0.5, 0.75, 0.99), y) # Correct
[1]  -69.31472  -69.41686 -171.02447

Binomial model

Example

Then we can plot the log-likelihood, including two vertical lines marking the true parameter value and the maximum likelihood estimate

par(mfrow = c(1, 2))
curve(llik_bin2(theta=x, data = y), 0, 1, 
      xlab= expression(p), ylab = expression(l(p)),
      main = "Log-likelihood function")
abline(v = p, col = "red")
abline(v = MLEp, col = "blue")
legend("topleft", legend = c(expression(p), expression(hat(p))),
       col = c("red", "blue"), lty = c(1, 1))
curve(llik_bin_v(theta=x, data = y), 0, 1, 
      xlab= expression(p), ylab = expression(l(p)),
      main = "Log-likelihood function")
abline(v = p, col = "red")
abline(v = MLEp, col = "blue")
legend("topleft", legend = c(expression(p), expression(hat(p))),
       col = c("red", "blue"), lty = c(1,1 ))